187 research outputs found

An ontology enhanced parallel SVM for scalable spam filter training

Author: Bauer
Blanco
Blanzieri
Blei
Breiman
Cao
Caruana
Chawla
Colas
Cristianini
Dean
Do
Gansterer
Godwin Caruana
Graf
Hall
Huang
Kearns
Kim
Maozhen Li
Mei
Platt
Suykens
Taura
Vapnik
Wang
Woodsend
Yang Liu
Zanghirati
Zhang
Publication venue: 'Elsevier BV'
Publication date: 01/05/2013

This is the post-print version of the final paper published in Neurocomputing. The published article is available from the link below. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. Copyright © 2013 Elsevier B.V.
Spam, under a variety of shapes and forms, continues to inflict increased damage. Varying approaches including Support Vector Machine (SVM) techniques have been proposed for spam filter training and classification. However, SVM training is a computationally intensive process. This paper presents a MapReduce based parallel SVM algorithm for scalable spam filter training. By distributing, processing and optimizing the subsets of the training data across multiple participating computer nodes, the parallel SVM reduces the training time significantly. Ontology semantics are employed to minimize the impact of accuracy degradation when distributing the training data among a number of SVM classifiers. Experimental results show that ontology based augmentation improves the accuracy level of the parallel SVM beyond the original sequential counterpart.
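The partition-and-combine idea in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's algorithm: the per-partition trainer is a Pegasos-style stochastic subgradient linear SVM without a bias term, the "reduce" step simply averages the weight vectors, the map step runs sequentially rather than on separate nodes, and the ontology-based augmentation is not modelled. All function names and the toy data are hypothetical.

```python
import random

def train_linear_svm(samples, lam=0.01, epochs=50, seed=0):
    """Pegasos-style stochastic subgradient descent on the hinge loss (no bias term)."""
    rng = random.Random(seed)
    w = [0.0] * len(samples[0][0])
    t = 0
    for _ in range(epochs):
        order = list(range(len(samples)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            x, y = samples[i]
            margin = y * sum(wj * xj for wj, xj in zip(w, x))
            # Shrink step from the L2 regulariser.
            w = [(1.0 - eta * lam) * wj for wj in w]
            # Hinge-loss step only when the margin is violated.
            if margin < 1.0:
                w = [wj + eta * y * xj for wj, xj in zip(w, x)]
    return w

def map_reduce_train(samples, n_parts=4):
    """Map: train one model per data partition. Reduce: average the weights."""
    parts = [samples[i::n_parts] for i in range(n_parts)]
    models = [train_linear_svm(p, seed=k) for k, p in enumerate(parts)]  # map
    dim = len(models[0])
    return [sum(m[j] for m in models) / n_parts for j in range(dim)]  # reduce

def predict(w, x):
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1

def accuracy(w, samples):
    hits = sum(1 for x, y in samples if predict(w, x) == y)
    return hits / len(samples)

def make_toy_data(n=200, seed=42):
    """Two well-separated 2-D Gaussian blobs standing in for spam / non-spam features."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        y = rng.choice([1, -1])
        data.append(([rng.gauss(2.0 * y, 0.5), rng.gauss(2.0 * y, 0.5)], y))
    return data

if __name__ == "__main__":
    data = make_toy_data()
    w = map_reduce_train(data, n_parts=4)
    print("averaged weights:", w)
    print("training accuracy:", accuracy(w, data))
```

In a real deployment the list comprehension in the map step would be replaced by work distributed across nodes; the averaging reducer is the simplest possible combiner and is chosen here only to keep the sketch short.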

Crossref

Brunel University Research Archive

Inferring short-term volatility indicators from Bitcoin blockchain

Author: A ElBahrawy
D Garcia
D Kondor
D Kondor
D Yermack
I Eyal
I Meilijson
I Vodenska
J Donier
J Suykens
K-K Kleineberg
L Kristoufek
M Möser
P Glasserman
RH Keshavan
X Huang
X Huang
Y Sakamoto
YB Kim
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 19/09/2018

In this paper, we study the possibility of inferring early warning indicators (EWIs) for periods of extreme bitcoin price volatility using features obtained from Bitcoin daily transaction graphs. We infer the low-dimensional representations of transaction graphs in the time period from 2012 to 2017 using the Bitcoin blockchain, and demonstrate how these representations can be used to predict extreme price volatility events. Our EWI, which is obtained with a non-negative decomposition, contains more predictive information than those obtained with singular value decomposition or the scalar value of the total Bitcoin transaction volume.

arXiv.org e-Print Archive

Crossref

Boston University Institutional Repository (OpenBU)

Towards a comprehensive C-budgeting approach of a coccolithophorid bloom in the Northern Bay of Biscay (June 2006)

Author: Borges Alberto
Chou Lei
d'Hoop Quentin
De Bodt Caroline
Engel Anja
Groom Steve
Harlay Jérôme
Piontek Judith
Roevros Nathalie
Sabbe Koen
Suykens Kim
Van Oostende Nicolas
Publication date: 25/01/2009

A biogeochemical multidisciplinary survey was carried out in the northern Bay of Biscay, in early June 2006, during which 14C-based primary production and calcification were determined as well as O2-based community respiration. Contemporary remote sensing images showed several patches of high reflectance (HR) in the investigated area. Based on remote sensing and in situ measured biogeochemical parameters, the area exhibited varying coccolithophorid bloom stages from its early development to the post-bloom stages. The major HR patch, characterizing a post-stationary stage of the bloom, was located between 48°N and 49°N over the shelf along the continental margin. It was associated with moderate chlorophyll-a levels, never exceeding 1.0 µg L⁻¹, dissolved phosphorus and silica depletion, and undersaturation of CO2 with respect to atmospheric equilibrium. Considered as the main drivers of the C cycle in this area, the CO2 fluxes associated with primary production, calcification and respiration were integrated in order to provide a comprehensive C budget in the area.

Open Repository and Bibliography - Liège

L2-norm multiple kernel learning and its application to biomedical data fusion

Author: A Daemen
A Daemen
Anneleen Daemen
AY Ng
B Schölkopf
Bart De Moor
C Bottomley
C Leslie
DMJ Tax
ED Andersen
FR Bach
G Condous
G Thomas
GC Cawley
GRG Lanckriet
GRG Lanckriet
J Gudmundsson
J Shawe-Taylor
JAK Suykens
JAK Suykens
Johan AK Suykens
JP Ye
K Tretyakov
K Veropoulos
Leon-Charles Tranchevent
M Grant
M Grant
M Kloft
M Kloft
M Kowalski
O Gevaert
R Hettich
R Reemtsen
RA Eeles
S Aerts
S Sonnenburg
S Yu
Shi Yu
SJ Kim
T De Bie
T van den Bosch
Tillmann Falck
V Vapnik
Y Zheng
Yves Moreau
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: This paper introduces the notion of optimizing different norms in the dual problem of support vector machines with multiple kernels. The selection of norms yields different extensions of multiple kernel learning (MKL) such as L∞, L1, and L2 MKL. In particular, L2 MKL is a novel method that leads to non-sparse optimal kernel coefficients, which is different from the sparse kernel coefficients optimized by the existing L∞ MKL method. In real biomedical applications, L2 MKL may have advantages over sparse integration methods for thoroughly combining complementary information in heterogeneous data sources.
Results: We provide a theoretical analysis of the relationship between the L2 optimization of kernels in the dual problem and the L2 coefficient regularization in the primal problem. Understanding the dual L2 problem grants a unified view on MKL and enables us to extend the L2 method to a wide range of machine learning problems. We implement L2 MKL for ranking and classification problems and compare its performance with the sparse L∞ and the averaging L1 MKL methods. The experiments are carried out on six real biomedical data sets and two large scale UCI data sets. L2 MKL yields better performance on most of the benchmark data sets. In particular, we propose a novel L2 MKL least squares support vector machine (LSSVM) algorithm, which is shown to be an efficient and promising classifier for large scale data sets processing.
Conclusions: This paper extends the statistical framework of genomic data fusion based on MKL. Allowing non-sparse weights on the data sources is an attractive option in settings where we believe most data sources to be relevant to the problem at hand and want to avoid a "winner-takes-all" effect seen in L∞ MKL, which can be detrimental to the performance in prospective studies. The notion of optimizing L2 kernels can be straightforwardly extended to ranking, classification, regression, and clustering algorithms. To tackle the computational burden of MKL, this paper proposes several novel LSSVM based MKL algorithms. Systematic comparison on real data sets shows that LSSVM MKL has performance comparable to the conventional SVM MKL algorithms. Moreover, large scale numerical experiments indicate that when cast as semi-infinite programming, LSSVM MKL can be solved more efficiently than SVM MKL.
Availability: The MATLAB code of algorithms implemented in this paper is downloadable from http://homes.esat.kuleuven.be/~sistawww/bioi/syu/l2lssvm.html.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Intelligent image-based colourimetric tests using machine learning framework for lateral flow assays

Author: Abuhassan
Achanta
Akraa
Alankus
Antesar M. Shabut
Arthur
Barbosa
Bourouis
Bradley
Carbonera
Chen
Contreras-naranjo
Cooper
Dang
Dhar
El-Bendary
Feng
Garg
Hussain
Janke
Jeannette Chin
Jonas
Kabir
Karlsen
Karlsen
Kettler
Khademhosseini
Khan
Khin T. Lwin
Kim
Kim
Koczula
Konnaiyan
Krizhevsky
Lopez-Ruiz
M.A. Hossain
Marzia Hoque Tania
Masawat
Mohammad Najlah
Mutlu
Otsu
Ozkan
Rahmat
Rajan
Rasmussen
Roda
Seo
Sergyan
Shabut
Shen
Sicard
Smith
Smith
Solmaz
Suykens
Suykens
Szegedy
Tania
Tania
Tania
Vashist
Wang
Wang
Wirth
Yetisen
Publication venue: 'Elsevier BV'
Publication date: 26/07/2019

This paper aims to deliberately examine the scope of an intelligent colourimetric test that fulfils ASSURED criteria (Affordable, Sensitive, Specific, User-friendly, Rapid and robust, Equipment-free, and Deliverable) and demonstrate the claim as well. This paper presents an investigation into an intelligent image-based system to perform automatic paper-based colourimetric tests in real-time to provide a proof-of-concept for a dry-chemical based or microfluidic, stable and semi-quantitative assay using a larger dataset with diverse conditions. The universal pH indicator papers were utilised as a case study. Unlike the works done in the literature, this work performs multiclass colourimetric tests using histogram based image processing and machine learning algorithm without any user intervention. The proposed image processing framework is based on colour channel separation, global thresholding, morphological operation and object detection. We have also deployed a server based convolutional neural network framework for image classification using inductive transfer learning on a mobile platform. The results obtained by both traditional machine learning and pre-trained model-based deep learning were critically analysed with the set evaluation criteria (ASSURED criteria). The features were optimised using univariate analysis and exploratory data analysis to improve the performance. The image processing algorithm showed >98% accuracy while the classification accuracy by Least Squares Support Vector Machine (LS-SVM) was 100%. On the other hand, the deep learning technique provided >86% accuracy, which could be further improved with a large amount of data. The k-fold cross validated LS-SVM based final system, examined on different datasets, confirmed the robustness and reliability of the presented approach, which was further validated using statistical analysis.
The understaffed and resource limited healthcare system can benefit from such an easy-to-use technology to support remote aid workers, assist in elderly care and promote personalised healthcare by eliminating the subjectivity of interpretation.
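The first two stages named in this abstract, colour channel separation and global thresholding, can be illustrated with a small pure-Python sketch. It assumes Otsu's method as the global threshold (the abstract does not say which method was used), represents an image as nested lists of (R, G, B) tuples, and omits the morphological and object-detection stages. All names and the toy image are hypothetical.

```python
def extract_channel(image, index):
    """Colour channel separation: keep one of R (0), G (1) or B (2) per pixel."""
    return [[pixel[index] for pixel in row] for row in image]

def otsu_threshold(channel):
    """Global threshold that maximises the between-class variance (Otsu's method)."""
    hist = [0] * 256
    for row in channel:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(256):
        w0 += hist[t]            # pixels with value <= t
        sum0 += t * hist[t]
        w1 = total - w0          # pixels with value > t
        if w0 == 0 or w1 == 0:
            continue
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        between = w0 * w1 * (m0 - m1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t

def binarize(channel, t):
    """Mark pixels at or below the threshold (the darker class) as foreground."""
    return [[v <= t for v in row] for row in channel]

def make_toy_image():
    """4x4 light-grey background with a 2x2 reddish patch, standing in for a test strip."""
    background, patch = (240, 240, 240), (200, 40, 40)
    image = [[background for _ in range(4)] for _ in range(4)]
    for r in (1, 2):
        for c in (1, 2):
            image[r][c] = patch
    return image

if __name__ == "__main__":
    green = extract_channel(make_toy_image(), 1)
    t = otsu_threshold(green)
    mask = binarize(green, t)
    print("threshold:", t)
    for row in mask:
        print("".join("#" if m else "." for m in row))
```

The green channel is used here because the toy patch differs most from the background in that channel; a real pipeline would choose the channel from the indicator chemistry.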

Crossref

Anglia Ruskin Research

Teesside University's Research Repository

Research @Leeds Trinity University

University of East Anglia digital repository

Identification of Piecewise Affine Systems Using Sum-of-Norms Regularization

Author: Batruni
Bemporad
Bemporad
Breiman
Candès
Candès
Choi
Donoho
Falck
Ferrari-Trecate
Gad
Grant
Grant
Hastie
Julian
Julian
Kim
Kim
Lin
Lindsten
Löfberg
Martinetz
Nakada
Ohlsson
Ohlsson
Ohlsson
Ozay
Paoletti
Pucar
Roll
Roll
Suykens
Tibshirani
Vapnik
Vidal
Vidal
von Luxburg
Yuan
Publication venue: 'Elsevier BV'

Crossref

All-paths graph kernel for protein-protein interaction extraction with evaluation of cross-corpus learning

Author: A Airola
A Yakushiji
AB Clegg
Antti Airola
AP Bradley
C Giuliano
C Nédellec
CD Meyer
D Zelenko
Filip Ginter
J Björne
J Ding
J Heimonen
JA Hanley
JAK Suykens
Jari Björne
JD Kim
JG Caporaso
K Fundel
KB Cohen
L Hirschman
L Hunter
M Lease
M Miwa
MC de Marneffe
P Zweigenbaum
R Bunescu
R Bunescu
R Bunescu
R Rifkin
R Sætre
S Pyysalo
S Pyysalo
S Pyysalo
S Van Landeghem
Sampo Pyysalo
T Gärtner
T Mitsumori
T Pahikkala
T Pahikkala
Tapio Pahikkala
Tapio Salakoski
Y Miyao
Publication venue: BioMed Central
Publication date: 01/01/2008

Crossref

Springer - Publisher Connector

PubMed Central

Rapid decline of the CO2 buffering capacity in the North Sea and implications for the North Atlantic Ocean

Author: A. E. Friederike Prowe
Alberto V. Borges
Anderson
Anderson
Bates
Borges
Bozec
Bozec
Corbière
de Haas
Delille
Dilling
Doney
Doney
Doney
Doney
Dore
Feely
Frankignoulle
Fung
Gruber
Hein J. W. de Baar
Helmuth Thomas
Inoue
Intergovernmental Panel on Climate Change (IPCC)
Ivan D. Lima
Johnson
Kim Suykens
Körtzinger
Lambert
Laure-Sophie Schiettecatte
Le Quéré
Lefèvre
Lenhart
Mathieu Koné
Mikaloff-Fletcher
Moore
Olsen
Omar
Orr
Polyakov
Revelle
Riebesell
Sabine
Sarmiento
Sarmiento
Schiettecatte
Scott C. Doney
Steven van Heuven
Takahashi
Takahashi
Takahashi
Thomas
Thomas
Thomas
Thomas
Thomas
Tsunogai
Winn
Wollast
Yann Bozec
Yeager
Publication venue: 'American Geophysical Union (AGU)'
Publication date: 01/01/2007

Author Posting. © American Geophysical Union, 2007. This article is posted here by permission of American Geophysical Union for personal use, not for redistribution. The definitive version was published in Global Biogeochemical Cycles 21 (2007): GB4001, doi:10.1029/2006GB002825.
New observations from the North Sea, a NW European shelf sea, show that between 2001 and 2005 the CO2 partial pressure (pCO2) in surface waters rose by 22 μatm, thus faster than atmospheric pCO2, which in the same period rose approximately 11 μatm. The surprisingly rapid decline in air-sea partial pressure difference (ΔpCO2) is primarily a response to an elevated water column inventory of dissolved inorganic carbon (DIC), which, in turn, reflects mostly anthropogenic CO2 input rather than natural interannual variability. The resulting decline in the buffering capacity of the inorganic carbonate system (increasing Revelle factor) sets up a theoretically predicted feedback loop whereby the invasion of anthropogenic CO2 reduces the ocean's ability to uptake additional CO2. Model simulations for the North Atlantic Ocean and thermodynamic principles reveal that this feedback should be stronger, at present, in colder midlatitude and subpolar waters because of the lower present-day buffer capacity and elevated DIC levels driven either by northward advected surface water and/or excess local air-sea CO2 uptake. This buffer capacity feedback mechanism helps to explain at least part of the observed trend of decreasing air-sea ΔpCO2 over time as reported in several other recent North Atlantic studies.
S. Doney and I. Lima were supported by NSF/ONR NOPP (N000140210370) and NASA (NNG05GG30G).

OceanRep

Crossref

Proceedings - University of Groningen

Woods Hole Open Access Server

University of Groningen

ARTS repository - University of Groningen

Open Repository and Bibliography - Liège

Dissertations of the University of Groningen

Fast and scalable Lasso via stochastic Frank–Wolfe methods with a convergence guarantee

Author: AE Hoerl
B Efron
B Schölkopf
Claudio Sartori
Emanuele Frandi
F Pedregosa
H Zou
H Zou
J Friedman
J Friedman
J Friedman
JA Tropp
Johan A. K. Suykens
K Clarkson
M Frank
M Jaggi
M Lee
P Richtárik
Q Zhou
R Tibshirani
R Tibshirani
R Tibshirani
Ricardo Ñanculef
S Shalev-Shwartz
S Shalev-Shwartz
SJ Kim
Stefano Lodi
T Hastie
Y Nesterov
Z Harchaoui
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Big Data and Causality

Author: A Bate
A Casillas
A Fujita
A Montalto
A Sharma
AKH Tung
B Widrow
BJ Ale
BJM Ale
C Bizer
C Hashimoto
C Mihăilă
C Mihăilă
C Silverstein
CC Chang
CC Yang
CM Bishop
CW Granger
D Birant
D Xu
D Zhang
EA Wan
G Sugihara
G Wu
GF Cooper
GK Gupta
GP Zhang
H Chen
H Chen
H Hassani
H Hassani
H Hassani
H Ibrahim
H Kargupta
H Yang
H Yun
J Cowie
J Han
J Han
J Li
J Li
J Li
J Ma
J Pustejovsky
J Sadek
J Vohradský
JA Suykens
JB Classen
JD Kim
JR Quinlan
JR Quinlan
JR Sato
JV Tu
JW Hunt
JW Seol
K Fundel
L Breiman
L Sanmiquel
L Talmy
L Wang
M Collins
M Hall
M Herland
M Lagazio
M Wahde
MD Richard
N Rizzolo
P Langley
PA Shoemaker
R Agrawal
R Bunescu
R Maesschalck De
R Xu
S Karimi
S Kleinberg
S Lee
S Pyysalo
S Zhang
S Zhao
SC Chen
SH Chen
ST Li
TC Fu
U Fayyad
U Hahn
U Roshan
U Soytas
V Mayer-Schonberger
WW Chow
X Zhang
Y Ji
Y Ji
Y Ji
YL Hsieh
YL Hsieh
Z Ghodsi
Z Lin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/08/2017

The file attached to this record is the author's final peer reviewed version. The Publisher's final version can be found by following the DOI link.
Causality analysis remains one of the fundamental research questions and the ultimate objective of a tremendous number of scientific studies. In line with the rapid progress of science and technology, the age of big data has significantly influenced causality analysis in various disciplines, especially over the last decade, because the complexity and difficulty of identifying causality among big data have dramatically increased. Data mining, the process of uncovering hidden information from big data, is now an important tool for causality analysis and has been extensively exploited by scholars around the world. The primary aim of this paper is to provide a concise review of causality analysis in big data. To this end, the paper reviews recent significant applications of data mining techniques in causality analysis, covering a substantial quantity of research to date, presented in chronological order with an overview table of data mining applications in the causality analysis domain as a reference directory.

Crossref

De Montfort University Open Research Archive